

Advanced Technical Skills (ATS) North America

# CPU MF - the "Lucky" 113s - z196 Update and WSC Experiences

**SHARE** Session 7717

August 4, 2010

John Burg jpburg@us.ibm.com IBM





## Trademarks

The following are trademarks of the International Business Machines Corporation in the United States and/or other countries.

| AlphaBlox*                                 | GDPS*                                      | RACF*                                         | Tivoli*                |
|--------------------------------------------|--------------------------------------------|-----------------------------------------------|------------------------|
| APPN*                                      | HiperSockets                               | Redbooks*                                     | Tivoli Storage Manager |
| CICS*                                      | HyperSwap                                  | Resource Link                                 | TotalStorage*          |
| CICS/VSE*                                  | IBM*                                       | RETAIN*                                       | VSE/ESA                |
| Cool Blue                                  | IBM eServer                                | REXX                                          | VTAM*                  |
| DB2*                                       | IBM logo*                                  | RMF                                           | WebSphere*             |
| DFSMS                                      | IMS                                        | S/390*                                        | zEnterprise            |
| DFSMShsm                                   | Language Environment*                      | Scalable Architecture for Financial Reporting | xSeries*               |
| DFSMSrmm                                   | Lotus*                                     | Sysplex Timer*                                | z9*                    |
| DirMaint                                   | Large System Performance Reference™ (LSPR™) | Systems Director Active Energy Manager        | z10                    |
| DRDA*                                      | Multiprise*                                | System/370                                    | z10 BC                 |
| DS6000                                     | MVS                                        | System p*                                     | z10 EC                 |
| DS8000                                     | OMEGAMON*                                  | System Storage                                | z/Architecture*        |
| ECKD                                       | Parallel Sysplex*                          | System x*                                     | z/OS*                  |
| ESCON*                                     | Performance Toolkit for VM                 | System z                                      | z/VM*                  |
| FICON*                                     | PowerPC*                                   | System z9*                                    | z/VSE                  |
| FlashCopy*                                 | PR/SM                                      | System z10                                    | zSeries*               |
|                                            | Processor Resource/Systems Manager         |                                               |                        |

\* Registered trademarks of IBM Corporation

#### The following are trademarks or registered trademarks of other companies.

Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.

Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.

IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce.

\* All other products may be trademarks or registered trademarks of their respective companies.

#### Notes:

Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.

All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.

This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.

All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.

## **Other Related Presentations**

- zPCR Capacity Sizing Lab Part 1 Introduction & Overview Wed 1:30 PM
- zPCR Capacity Sizing Lab Part 2 Hands on Lab Wed 3 PM
- To MIPS or Not to MIPS Thursday 9:30
- Lunch and Learn: <u>The All New LSPR and z196</u> Thursday 12:15
- Framework For Doing Capacity Sizing on System z Thursday 1:30 PM
- IBM Smart Analytics Optimizer Thursday 3 PM



# Topics

#### CPU MF Introduction

- What it is, how to enable it
- New support for z10s and zEnterprise 196 (z196)
  - Including Sync Interval and Identification of Processor Type (e.g. GCP, zAAP, zIIP)

#### Workload Characterization Update

- Step 1 completed

#### Key Performance Metrics for z10s and z196s

- CPI, Problem State, Cache / Memory Hierarchy
- New metrics and formulas

#### WSC Customer Experiences with SMF 113s

- Lessons Learned Summary
- HiperDispatch=No/Yes
- DB2 10 for z/OS Beta 1MB Page Buffer pools

#### Summary



## **CPU** Measurement Facility Introduction

## What is the z10 CPU Measurement Facility

#### New hardware instrumentation facility "CPU Measurement Facility" (CPU MF)

- Available on System z10 GA2 (EC and BC) and z196
- Supported by a new z/OS component (Instrumentation), Hardware Instrumentation Services (HIS)

#### Potential Uses – for this new "cool" virtualization technology

- COUNTERS
  - Supplement Current Performance Metrics
  - Workload characterization
- SAMPLING
  - ISV product improvement
  - Application Tuning

#### IBM Research article

- "IBM System z10 performance improvements with software & hardware synergy"
- <u>http://www.research.ibm.com/journal/rd/531/jackson.pdf</u>
- Contact IBM team for copy of the article



### Requirements and Steps to utilize z10 and z196 CPU MF

#### Requirements for CPU MF

- z196 or System z10 machine
  - z10 must be at GA2 Driver 76D Bundle #20 or higher
- z10 z/OS LPAR being measured must be at z/OS 1.8 or higher with APARs:
  - OA25755, OA25750, and OA25773; also OA30486 for z/OS 1.10 and higher for new functionality
  - OA27623 also recommended to add "CPU Speed" to SMF 113s and HIS COUNTERS output
  - Not currently supported for z/OS running as a z/VM guest; z/VM native prototype support is in process
- z196 z/OS LPARs being measured at z/OS 1.9 or higher require APAR OA30486
  - z/OS 1.8 requires OA33052

#### Steps to utilize CPU MF

Operationally, CPU MF works the same on z196

- Configure the z10 or z196 to collect CPU MF Data
  - Update LPAR Security Tabs (See appendix)
- Configure HIS on z/OS to collect CPU MF Data
  - Set up HIS Proc
  - Set up OMVS Directory
  - Collect SMF 113s via SMFPRMxx
- Collect CPU MF Data
  - Start HIS, then Modify with Begin/End for COUNTERS or SAMPLING
  - "F HIS,B,TT='Text',PATH='/his/',CTRONLY,CTR=ALL"
- Analyze the CPU MF Data
  - SMF 113s

CPU MF has very low overhead, is easy to implement, and produces a very small SMF record

//HIS PROC

//HIS EXEC PGM=HISINIT,REGION=0K,TIME=NOLIMIT

//SYSPRINT DD SYSOUT=\*

Remember CTR=ALL to get Extended Counters!



### New HIS support for Sync Interval, PU Type and STATECHANGE

- APAR OA30486 with z/OS 1.12 GA will be rolled down to z/OS V1R11 and z/OS V1R10
  - Applicable for <u>z10s</u> and <u>z196s</u> for new functionality
  - New CPU MF capability to sync SMF 113s with other SMF records
    - SMFINTVAL=SYNC
      - Synchronize records with the SMF global recording interval
    - ...or choose Interval time 1-60
    - Recommendation is "SYNC":

Recommend SMFINTVAL=SYNC or SI=SYNC

- "F HIS,B,TT='Text',PATH='/his/',CTRONLY,CTR=ALL,SMFINTVAL=SYNC"
- Identification of PU Type (GCP, zIIP or zAAP) in SMF 113 record
  - SMF113\_2\_CpuProcClass: '0' = GCP / '2' = zAAP / '4' = zIIP
- STATECHANGE
- Both SMFINTVAL and STATECHANGE <u>can be abbreviated</u>, e.g., SI=SYNC, SC=SAVE
   "F HIS,B,TT='Text',PATH='/his/',CTRONLY,CTR=ALL,SI=SYNC,SC=SAVE"
- In SMF 113s the z196 processor is identified by
   SMF113\_2\_CTRVN2 = '2' for z196, '1' for z10

z196 Extended Counters have changed; use CTRVN2 to determine whether the record came from a z10 or a z196
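
The command variants above differ only in a few keywords, so a small helper can assemble them consistently. The following is a minimal sketch, not an IBM-supplied tool: the function name `build_his_begin` and its parameters are hypothetical, and it only covers the keywords shown on these slides (TT, PATH, CTRONLY, CTR=ALL, SMFINTVAL, STATECHANGE).

```python
def build_his_begin(title, path="/his/", smf_interval="SYNC", state_change=None):
    """Compose an 'F HIS,B,...' command string for a COUNTERS-only run.

    Hypothetical helper: it only formats text, it does not issue the command.
    """
    if smf_interval != "SYNC":
        minutes = int(smf_interval)
        # The slides give the valid interval range as 1-60.
        if not 1 <= minutes <= 60:
            raise ValueError("SMFINTVAL must be SYNC or 1-60")
        smf_interval = str(minutes)

    parts = [
        "F HIS",
        "B",
        f"TT='{title}'",
        f"PATH='{path}'",
        "CTRONLY",
        "CTR=ALL",  # needed to get Extended Counters
        f"SMFINTVAL={smf_interval}",
    ]

    if state_change is not None:
        if state_change not in ("SAVE", "STOP", "IGNORE"):
            raise ValueError("STATECHANGE must be SAVE, STOP or IGNORE")
        parts.append(f"STATECHANGE={state_change}")

    return ",".join(parts)


print(build_his_begin("Text"))
print(build_his_begin("Text", smf_interval=1, state_change="SAVE"))
```

Leaving `state_change` unset mirrors the recommendation to rely on the STATECHANGE=SAVE default.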

## HIS STATECHANGE

#### HIS detects and handles significant hardware events (state change)

- Replacement Capacity (Customer Initiated Upgrade)
- On/Off Capacity on Demand
- How HIS reacts depends on the STATECHANGE parameter specified
  - STATECHANGE=STOP
    - Stop the collection run when the event is detected
  - STATECHANGE=IGNORE
    - Continue the collection run as if the event never happened
  - STATECHANGE=SAVE (Default)
    - Record the previous state of the system (Save all data)
      - Write and close the .CNT file
      - Close all .SMP files (1 per CPU)
      - Cut SMF Type 113 Records (1 per CPU)
    - Continue the collection run with the new state
      - Create new .SMP files (1 per CPU)
      - Cut SMF Type 113 Records (1 per CPU)

#### STATECHANGE information not directly reported in the SMF 113

You will see additional record(s) and an increase/decrease in CPIDs or "CPU Speed"

Recommend STATECHANGE=SAVE (the default), so there is no need to specify it

Verify with SMF 113s that "CPU Speed" or "Effective GHz" changed as expected
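
Because a state change is not flagged directly in the SMF 113, one way to verify it is to scan already-extracted records for a change in "CPU Speed" per CPU. The sketch below assumes the records have been reduced to simple `(timestamp, cpu, speed)` tuples; that layout and the function name `find_speed_changes` are hypothetical, and the sample values are invented for illustration only.

```python
def find_speed_changes(records):
    """Return (timestamp, cpu, old_speed, new_speed) for each CPU speed change.

    `records` is an iterable of (timestamp, cpu, speed) tuples in time order.
    The tuple layout is hypothetical; it is not the SMF 113 record format.
    """
    last_speed = {}
    changes = []
    for timestamp, cpu, speed in records:
        previous = last_speed.get(cpu)
        if previous is not None and previous != speed:
            changes.append((timestamp, cpu, previous, speed))
        last_speed[cpu] = speed
    return changes


# Made-up illustration values, not measurements.
sample = [
    ("10:00", 0, 4400),
    ("10:05", 0, 4400),
    ("10:10", 0, 3500),
]
print(find_speed_changes(sample))
```

Tracking the last speed per CPU keeps the check correct when records for several CPUs are interleaved.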





## New HIS APAR OA30486 support for z196 – WSC Example

15:33:13.24 JPBURG 00000200 F HIS,B,TT='Z196 w/ TEST',CTRONLY,CTR=ALL,SMFINTVAL=SYNC

15:33:14.22 STC01226 00000000 HIS011I HIS DATA COLLECTION STARTED

| Time Stamp            | CPU # | CpuProcClass | CTRVN1 | CTRVN2 | CPSP |
|-----------------------|-------|--------------|--------|--------|------|
| 10 JUL 22 15:33:14.22 | 0     | 0            | 1      | 2      | 5208 |
| 10 JUL 22 15:33:14.22 | 1     | 0            | 1      | 2      | 5208 |
| 10 JUL 22 15:33:14.22 | 4     | 4            | 1      | 2      | 5208 |
| 10 JUL 22 15:33:14.22 | 5     | 4            | 1      | 2      | 5208 |
| 10 JUL 22 15:35:00.00 | 0     | 0            | 1      | 2      | 5208 |
| 10 JUL 22 15:35:00.00 | 1     | 0            | 1      | 2      | 5208 |
| 10 JUL 22 15:35:00.00 | 4     | 4            | 1      | 2      | 5208 |
| 10 JUL 22 15:35:00.00 | 5     | 4            | 1      | 2      | 5208 |

Callouts: CpuProcClass '0' = GCP, '4' = zIIP; CTRVN2 '2' = z196; CPSP 5208 = '5.2' GHz (z196)

SMF 113 Synched with SMF Global Recording Interval - 5 Minutes
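
The field values in this example can be decoded with a small lookup. This is a sketch only: the function name `describe` and the dictionary layout are hypothetical, the code mappings come from the slides ('0' GCP, '2' zAAP, '4' zIIP; CTRVN2 '1' z10, '2' z196), and it assumes CPSP is expressed in MHz, which is consistent with 5208 being shown as '5.2' GHz.

```python
# Code-to-name mappings taken from the slides.
PROC_CLASS = {0: "GCP", 2: "zAAP", 4: "zIIP"}
MACHINE = {1: "z10", 2: "z196"}


def describe(cpu, proc_class, ctrvn2, cpsp):
    """Turn raw SMF 113 field values into readable labels (hypothetical helper)."""
    return {
        "cpu": cpu,
        "pu_type": PROC_CLASS.get(proc_class, "unknown"),
        "machine": MACHINE.get(ctrvn2, "unknown"),
        "ghz": cpsp / 1000.0,  # assumption: CPSP is in MHz
    }


# CPU 4 from the example above: CpuProcClass 4, CTRVN2 2, CPSP 5208.
print(describe(4, 4, 2, 5208))
```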



## New HIS APAR OA30486 support for z196 – WSC Example

12:19:11.82 JPBURG 00000200 F HIS,B,TT='Z196 TEST',CTRONLY,CTR=ALL,SMFINTVAL=1

12:19:11.82 STC32434 00000000 HIS011I HIS DATA COLLECTION STARTED



### SMF 113 Written every 1 Minute



## Workload Characterization Update



- Historically, <u>LSPR workload capacity curves (primitives and mixes) have had</u> <u>application names or been identified by a "software" captured characteristic</u> – For example, CICS, IMS, OLTP-T, CB-L, LoIO-mix, TI-mix, etc
- However, capacity performance is more closely associated with <u>how a workload is</u> using and interacting with a processor "hardware" design
- With the availability of <u>CPU MF (SMF 113) data on z10</u>, the ability to gain <u>insight into</u> the interaction of workload and hardware has arrived
- The <u>knowledge gained is still evolving</u>, but the <u>first step in the process is to produce</u> <u>LSPR workload capacity curves</u> based on the underlying hardware sensitivities
- Thus, the <u>LSPR for z196 will introduce three new workload categories</u> which replace all prior primitives and mixes
  - Based on new hardware defined metric called **<u>Relative Nest Intensity</u>**
  - Low, Average, High (Relative Nest Intensity)
- To simplify the transition, <u>an easy and automatic translation of old names to new</u> <u>categories will be supplied in zPCR</u>
  - For example, if you have been using LoIO-mix in your studies, you will simply use the new "Average" workload in the future

#### Instruction Complexity (Micro processor design)

- Many design alternatives
  - Cycle time (GHz), instruction architecture, pipeline, superscalar, Out-Of-Order, branch prediction and more
- Workload effect
  - May be different with each processor design
  - But once established for a workload on a processor, doesn't change very much

#### Memory Hierarchy or "Nest"

- Many design alternatives
  - Cache (levels, size, private, shared, latency, MESI protocol), controller, data buses
- Workload effect
  - Quite variable
  - Sensitive to many factors: locality of reference, dispatch rate, IO rate, competition with other applications and/or LPARs, and more
  - Net effect of these factors represented in "Relative Nest Intensity"
- Relative Nest Intensity (RNI)
  - Activity beyond private-on-chip cache(s) is the most sensitive area
  - Reflects distribution and latency of sourcing from shared caches and memory
  - Level 1 cache miss percentage also important
  - Data for calculation available from CPU MF (SMF 113) starting with z10







## **CPU MF**

## **z10 Customer Workload Characterization Summary**





## Workload Characterization Future Vision – Step 1 is Complete

- Future vision to help identify workload characteristics and to provide better input for capacity planning and performance
  - Step 1: Create Workload Categories from SMF 113s (complete)
    - Over 150 z10 Customer/Partitions have participated thru 8/1. Thank You!
    - Measured LSPR with these new Categories
  - Step 2 Refine Workload Selection Process
    - As you move to z196 from z10, looking for "Before" and "After" volunteers

Still Looking for "Volunteers" – (3 days, 24 hours/day, SMF 70s, 71s, 72s, 113s per LPAR) "Before z10" and "After z196"

If interested, send a note to <u>jpburg@us.ibm.com</u>. No deliverable will be returned

Benefit: Opportunity to ensure your data is used to influence analysis

## Recommend Capturing CPU MF SMF 113 Records on z10s

- What is CPU MF?
  - A new z10 and z196 capability to measure cache / memory hierarchy characteristics
- How can it be used today?
  - To supplement current performance metrics (e.g. from SMF, RMF, DB2, CICS)
  - As a secondary data source to understand why performance may have changed
- What can it be used for in z196 capacity planning?

  - Capacity Sizing process is the same as today with zPCR
    - Based on DASD I/Os per MSU consumed
    - And optionally use a new Relative Nest Intensity "Hint"
    - CP3KEXTR will process SMF 113s and include them in the EDF file for zPCR
  - SMF 113 data may prove useful in support of an installation of z196
- What CPU MF is not
  - It is **Not** a substitute for traditional performance nor capacity metrics
  - It does **Not** indicate the capacity being achieved by the LPAR or processor
- Recommend Enabling CPU MF COUNTERS on key z10 production partitions
  - See CPU MF Overview and WSC Experiences Techdoc TC000041
    - http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TC000041
    - Overview presentation and a white paper on how to enable CPU MF COUNTERS



## Key Performance Metrics for z10s and z196s

#### z196 versus z10 hardware comparison


- z10 EC
  - CPU
    - 4.4 GHz
  - Caches
    - L1 private 64k i, 128k d
    - L1.5 private 3 MB
    - L2 shared 48 MB / book
    - Book interconnect: star





- z196
  - CPU
    - 5.2 GHz
    - Out-Of-Order execution
  - Caches
    - L1 private 64k i, 128k d
    - L2 private 1.5 MB
    - L3 shared 24 MB / chip
    - L4 shared 192 MB / book
    - Book interconnect: star





# CPU MF and HIS provide a z/OS logical view: Resource Usage and Cache Hierarchy Sourcing



## LPAR / Logical CP view:

- Memory Accesses
- Cache
  - L2 / (L4 z196) Accesses (local and remote)
  - L3 Accesses on z196
  - L1.5 / (L2 z196) Accesses
  - L1 Sourced from Hierarchy
- Instructions and Cycles
- Crypto function

# Current CPU MF Key Performance Metrics:

| CPI | PRBSTATE | L1MP | L15P | L2LP | L2RP | MEMP | LPARCPU |
|-----|----------|------|------|------|-------|------|---------|

- **CPI – Cycles per Instruction**
- **PRBSTATE – % Problem State**
- L1MP – Level 1 Miss %
- L15P – % sourced from L1.5 cache
- L2LP – % sourced from Level 2 Local cache (on same book)
- L2RP – % sourced from Level 2 Remote cache (on different book)
- **MEMP – % sourced from Memory**
- LPARCPU – APPL% (GCPs, zAAPs, zIIPs) captured and uncaptured
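
To make the definitions above concrete, here is a rough sketch of how these metrics relate to raw counts. It is illustrative only and does not use the actual SMF 113 counter numbers: the function name `key_metrics`, the input names, and the dictionary keys are hypothetical. It also assumes that L1MP is L1 misses per 100 instructions and that every L1 miss is sourced from exactly one of L1.5, L2 local, L2 remote or memory (the z10 hierarchy).

```python
def key_metrics(instructions, cycles, sourced):
    """Compute CPI, L1MP and the sourcing percentages from raw counts.

    `sourced` maps 'L15', 'L2L', 'L2R', 'MEM' to the number of L1 misses
    sourced from that level (hypothetical input layout).
    """
    misses = sum(sourced.values())
    if instructions <= 0 or misses <= 0:
        raise ValueError("need positive instruction and miss counts")

    metrics = {
        "CPI": cycles / instructions,
        # Assumption: Level 1 Miss % is misses per 100 instructions.
        "L1MP": 100.0 * misses / instructions,
    }
    # Each sourcing percentage is that level's share of all L1 misses.
    for level, count in sourced.items():
        metrics[level + "P"] = 100.0 * count / misses
    return metrics


# Invented counts for illustration only.
print(key_metrics(1000, 7200, {"L15": 30, "L2L": 6, "L2R": 1, "MEM": 3}))
```

Under these assumptions the four sourcing percentages always add up to 100.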

Workload Characterization L1 Sourcing from cache/memory hierarchy

#### Introducing the new Relative Nest Intensity (RNI) metric

- <u>Relative Nest Intensity</u> reflects the distribution and latency of sourcing from shared caches and memory
  - For z10 Technology the Relative Nest Intensity = (L2LP \* 1 + L2RP \* 2.4 + MEMP \* 7.5) / 100
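
The z10 formula above translates directly into code. This is a minimal sketch; the function name is hypothetical, and the weights 1, 2.4 and 7.5 apply to z10 technology only, as stated on the slide.

```python
def relative_nest_intensity_z10(l2lp, l2rp, memp):
    """Relative Nest Intensity for z10 from the three sourcing percentages."""
    return (l2lp * 1.0 + l2rp * 2.4 + memp * 7.5) / 100.0


# Inputs: the "All Volunteers" average L2LP, L2RP and MEMP values
# from the z10 workload characterization summary in this presentation.
rni = relative_nest_intensity_z10(21.2, 1.6, 8.3)
print(round(rni, 2))
```

With those average inputs the result is about 0.87. The heavy memory weight means a small rise in MEMP moves RNI far more than the same rise in L2LP.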



[Figure: CPI diagram with two labeled parts, Instruction Complexity (Microprocessor Design) and Memory Hierarchy or "Nest", above CPUs with L1 and L1.5 caches and PR/SM]

#### Updated z10 CPU MF Workload Characterization Summary

| Customer       | SYSID MON DAY | CPI  | PRBSTATE | Est Instr Cmplx CPI | Est Finite CPI | Est SCPL1M | L1MP | L15P | L2LP | L2RP | MEMP | Rel Nest Intensity | LPARCPU | Eff GHz |
|----------------|---------------|------|----------|---------------------|----------------|------------|------|------|------|------|------|--------------------|---------|---------|
| All Volunteers | Minimum       | 3.1  | 1.1      | 2.1                 | 0.9            | 59.6       | 1.3  | 48.6 | 5.6  | 0.0  | 2.2  | 0.4                | 14.4    |         |
| All Volunteers | Average       | 7.2  | 31.2     | 3.2                 | 3.9            | 101.4      | 3.9  | 68.9 | 21.2 | 1.6  | 8.3  | 0.9                | 376.3   |         |
| All Volunteers | Maximum       | 12.0 | 67.1     | 5.6                 | 8.6            | 194.9      | 6.9  | 82.8 | 32.9 | 6.9  | 20.2 | 1.8                | 1442.3  | 4.40    |

New z10 columns are:

1. Est Instr Cmplx CPI
2. Est Finite CPI
3. Est SCPL1M
4. Rel Nest Intensity
5. Eff GHz

Column definitions:

- CPI – Cycles per Instruction
- Prb State – % Problem State
- Est Instr Cmplx CPI – Estimated Instruction Complexity CPI (infinite L1)
- Est Finite CPI – Estimated CPI from Finite cache/memory
- Est SCPL1M – Estimated Sourcing Cycles per Level 1 Miss
- L1MP – Level 1 Miss %
- L15P – % sourced from L1.5 cache
- L2LP – % sourced from Level 2 Local cache (on same book)
- L2RP – % sourced from Level 2 Remote cache (on different book)
| MEMP - % sourced from Memory                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |         |       |             |           |                   | MEN   | MEMP - % sourced from Memory                                             |           |        |        |          |         |        |        |          |        |      |      |
| Rel Nest Intensity – Reflects distribution and latency of sourcing from                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |         |       |             |           |                   | Rel   | Nest Int                                                                 | ensitv –  | Refle  | ects d | listribu | ution   | and    | latenc | v of sou | urcina | from |      |
| Workload Characterization shared caches and memory                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |         |       |             |           |                   | shar  |                                                                          | •         |        |        | _        |         |        |        | ,        | 9      | -    |      |
| L1 Sourcing from cache/memory hierarchy<br>LPARCPU - APPL% (GCPs, zAAPs, zIIPs) captured and uncaptured                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |         |       | L1 Sourcing | trom cach | e/memory hier arc | LPA   | RCPU -                                                                   | APPL      | % (GC  | Ps, z  | zAAPs    | s, zIII | Ps) c  | apture | ed and u | Incapt | ured |      |
| Eff GHz – Effective gigahertz for GCPs, cycles per nanosecond                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |         |       |             |           |                   | Eff ( | GHz – E                                                                  | ffective  | gigah  | ertz f | for GC   | CPs,    | cycle  | es per | nanose   | cond   |      |      |



#### WSC z196 Sample CPU MF from July – 5 Minute Synched Intervals

| SYSID | Mon | Day | SH | Hour  | CPI  | Prb State | Est Instr Cmplx CPI | Est Finite CPI | Est SCPL1M | L1MP | L2P  | L3P  | L4LP | L4RP | MEMP | Rel Nest Intensity | LPARCPU | Eff GHz |
|-------|-----|-----|----|-------|------|-----------|---------------------|----------------|------------|------|------|------|------|------|------|--------------------|---------|---------|
| SYSD  | JUL | 22  | N  | 17.25 | 3.65 | 2.3       | 2.70                | 0.95           | 26         | 3.7  | 77.8 | 20.5 | 0.9  | 0.2  | 0.7  | 0.24               | 0.8     | 5.2     |
| SYSD  | JUL | 22  | N  | 17.33 | 3.68 | 2.3       | 2.73                | 0.95           | 26         | 3.6  | 77.4 | 20.8 | 0.9  | 0.2  | 0.7  | 0.24               | 0.8     | 5.2     |
| SYSD  | JUL | 22  | N  | 17.42 | 3.67 | 2.3       | 2.72                | 0.95           | 26         | 3.7  | 78.0 | 20.3 | 0.9  | 0.2  | 0.7  | 0.24               | 0.8     | 5.2     |
| SYSD  | JUL | 22  | N  | 17.50 | 3.64 | 2.3       | 2.71                | 0.93           | 26         | 3.6  | 77.8 | 20.5 | 0.9  | 0.2  | 0.7  | 0.24               | 0.8     | 5.2     |
| SYSD  | JUL | 22  | N  | 17.58 | 3.66 | 2.3       | 2.72                | 0.94           | 26         | 3.6  | 77.9 | 20.4 | 0.8  | 0.2  | 0.7  | 0.24               | 0.8     | 5.2     |
| SYSD  | JUL | 22  | N  | 17.67 | 3.65 | 2.3       | 2.72                | 0.94           | 26         | 3.6  | 77.0 | 21.1 | 0.9  | 0.2  | 0.7  | 0.24               | 0.8     | 5.2     |
| SYSD  | JUL | 22  | N  | 17.75 | 3.66 | 2.3       | 2.72                | 0.94           | 26         | 3.6  | 77.4 | 20.8 | 0.9  | 0.2  | 0.7  | 0.24               | 0.8     | 5.2     |
| SYSD  | JUL | 22  | N  | 17.83 | 3.64 | 2.3       | 2.70                | 0.94           | 26         | 3.6  | 77.1 | 21.0 | 0.9  | 0.2  | 0.7  | 0.24               | 0.8     | 5.2     |
| SYSD  | JUL | 22  | N  | 17.92 | 2.78 | 49.2      | 2.06                | 0.72           | 34         | 2.1  | 76.7 | 18.3 | 1.8  | 1.4  | 1.9  | 0.42               | 1.5     | 5.2     |
| SYSD  | JUL | 22  | N  | 18.00 | 3.65 | 3.2       | 2.71                | 0.94           | 26         | 3.6  | 77.0 | 21.1 | 1.0  | 0.2  | 0.7  | 0.25               | 0.8     | 5.2     |
| SYSD  | JUL | 22  | N  | 18.08 | 5.00 | 0.8       | 3.46                | 1.53           | 27         | 5.7  | 86.1 | 11.9 | 0.3  | 0.1  | 1.7  | 0.28               | 9.7     | 5.2     |
| SYSD  | JUL | 22  | N  | 18.17 | 3.72 | 3.2       | 2.76                | 0.96           | 27         | 3.6  | 76.8 | 21.0 | 1.1  | 0.2  | 0.8  | 0.26               | 0.9     | 5.2     |
| SYSD  | JUL | 22  | N  | 18.25 | 3.82 | 3.7       | 2.76                | 1.06           | 28         | 3.7  | 77.4 | 19.8 | 1.2  | 0.6  | 1.1  | 0.30               | 0.9     | 5.2     |

CPI – Cycles per Instruction

Prb State - % Problem State

Est Instr Cmplx CPI – Estimated Instruction Complexity CPI (infinite L1)

Est Finite CPI – Estimated CPI from Finite cache/memory

Est SCPL1M – Estimated Sourcing Cycles per Level 1 Miss

L1MP – Level 1 Miss %

L2P – % sourced from Level 2 cache

L3P – % sourced from Level 3 on same Chip cache

L4LP – % sourced from Level 4 Local cache (on same book)

L4RP - % sourced from Level 4 Remote cache (on different book)

MEMP - % sourced from Memory

CPU MF provides measurement of the z196 Level 3 shared cache

These numbers come from a synthetic Benchmark and do not represent a production workload

Rel Nest Intensity – Reflects distribution and latency of sourcing from shared caches and memory

LPARCPU - APPL% (GCPs, zAAPs, zIIPs) captured and uncaptured

Eff GHz – Effective gigahertz for GCPs, cycles per nanosecond

Workload Characterization L1 Sourcing from cache/memory hierarchy

# Formulas – z10

Workload Characterization L1 Sourcing from cache/memory hierarchy

| Metric   | Calculation – note all fields are deltas between intervals                            |
|----------|---------------------------------------------------------------------------------------|
| CPI      | B0 / B1                                                                               |
| PRBSTATE | (P33 / B1) * 100                                                                      |
| L1MP     | ((B2+B4) / B1) * 100                                                                  |
| L15P     | ((E128+E129) / (B2+B4)) * 100                                                         |
| L2LP     | ((E130+E131) / (B2+B4)) * 100                                                         |
| L2RP     | ((E132+E133) / (B2+B4)) * 100                                                         |
| MEMP     | (((E134+E135) + (B2+B4-E128-E129-E130-E131-E132-E133-E134-E135)) / (B2+B4)) * 100 |
| LPARCPU  | ( ((1/CPSP/1,000,000) * B0) / Interval in Seconds) * 100                              |

CPI – Cycles per Instruction

PRBSTATE - % Problem State

L1MP – Level 1 Miss %

L15P - % sourced from L1.5 cache

L2LP - % sourced from Level 2 Local cache (on same book)

L2RP - % sourced from Level 2 Remote cache (on different book)

MEMP - % sourced from Memory

LPARCPU - APPL% (GCPs, zAAPs, zIIPs) captured and uncaptured

- B\* Basic Counter Set Counter Number
- P\* Problem-State Counter Set Counter Number

See "The Set-Program-Parameter and CPU-Measurement Facilities" SA23-2260-0 for full description

E\* - Extended Counters - Counter Number

See "IBM The CPU-Measurement Facility Extended Counters Definition for z10" SA23-2261-0 for full description
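The z10 calculations in the table above can be sketched in Python. This is only an illustration: the function name `z10_metrics`, the dictionary keyed by counter name, and the sample deltas are assumptions made for the example. Only the formulas themselves come from the slide.

```python
def z10_metrics(c, cpsp, interval_seconds):
    """Apply the z10 formulas to one interval.

    c maps counter names ("B0", "P33", "E128", ...) to deltas between
    intervals. cpsp and interval_seconds are used exactly as in the
    LPARCPU formula. All names here are hypothetical.
    """
    l1_misses = c["B2"] + c["B4"]
    # Everything counted by E128..E135 (L1.5, L2 local, L2 remote, memory).
    counted = sum(c[f"E{n}"] for n in range(128, 136))
    return {
        "CPI": c["B0"] / c["B1"],
        "PRBSTATE": c["P33"] / c["B1"] * 100,
        "L1MP": l1_misses / c["B1"] * 100,
        "L15P": (c["E128"] + c["E129"]) / l1_misses * 100,
        "L2LP": (c["E130"] + c["E131"]) / l1_misses * 100,
        "L2RP": (c["E132"] + c["E133"]) / l1_misses * 100,
        # Misses not accounted for by any extended counter are added to memory.
        "MEMP": ((c["E134"] + c["E135"]) + (l1_misses - counted)) / l1_misses * 100,
        "LPARCPU": ((1 / cpsp / 1_000_000) * c["B0"]) / interval_seconds * 100,
    }


# Made-up deltas, chosen only so the results are round numbers.
sample_deltas = {
    "B0": 4_400_000_000, "B1": 1_100_000_000,
    "B2": 40_000_000, "B4": 70_000_000, "P33": 550_000_000,
    "E128": 50_000_000, "E129": 27_000_000,
    "E130": 12_000_000, "E131": 10_000_000,
    "E132": 3_000_000, "E133": 2_500_000,
    "E134": 2_000_000, "E135": 1_300_000,
}
metrics = z10_metrics(sample_deltas, cpsp=4400, interval_seconds=10)
```

With these deltas the four sourcing percentages (L15P, L2LP, L2RP, MEMP) add up to 100, which is a convenient sanity check on real data as well.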

# Formulas – z10 Additional

| Metric              | Calculation – note all fields are deltas between intervals |
|---------------------|------------------------------------------------------------|
| Est Instr Cmplx CPI | CPI – Estimated Finite CPI                                 |
| Est Finite CPI      | ((B3+B5) / B1) * .84                                       |
| Est SCPL1M          | ((B3+B5) / (B2+B4)) * .84                                  |
| Rel Nest Intensity  | (1.0*L2LP + 2.4*L2RP + 7.5*MEMP) / 100                     |
| Eff GHz             | CPSP / 1000                                                |

#### Note these Formulas may change in the future

Est Instr Cmplx CPI – Estimated Instruction Complexity CPI (infinite L1)

Est Finite CPI – Estimated CPI from Finite cache/memory

Est SCPL1M – Estimated Sourcing Cycles per Level 1 Miss

Rel Nest Intensity – Reflects distribution and latency of sourcing from shared caches and memory

Eff GHz - Effective gigahertz for GCPs, cycles per nanosecond

Workload Characterization L1 Sourcing from cache/memory hierarchy

- B\* Basic Counter Set Counter Number
- P\* Problem-State Counter Set Counter Number

See "The Set-Program-Parameter and CPU-Measurement Facilities" SA23-2260-0 for full description
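A minimal Python sketch of the additional z10 formulas follows. The function names, the argument layout, and the made-up counter deltas are assumptions for illustration. The sourcing percentages passed to `z10_rni` are the two rows of the 1 MB page table later in this deck.

```python
def z10_additional(c, cpsp):
    """Additional z10 metrics from counter deltas (hypothetical layout)."""
    l1_misses = c["B2"] + c["B4"]
    sourcing_cycles = c["B3"] + c["B5"]
    cpi = c["B0"] / c["B1"]
    est_finite_cpi = sourcing_cycles / c["B1"] * 0.84
    return {
        "Est Instr Cmplx CPI": cpi - est_finite_cpi,
        "Est Finite CPI": est_finite_cpi,
        "Est SCPL1M": sourcing_cycles / l1_misses * 0.84,
        "Eff GHz": cpsp / 1000,
    }


def z10_rni(l2lp, l2rp, memp):
    """Relative Nest Intensity for z10, from percentages."""
    return (1.0 * l2lp + 2.4 * l2rp + 7.5 * memp) / 100


# Made-up deltas for the counter-based metrics.
extra = z10_additional(
    {"B0": 4000, "B1": 1000, "B2": 30, "B3": 1000, "B4": 70, "B5": 1500},
    cpsp=4400,
)

# Percentages from the DB2 V10 4K and 1MB rows of the 1 MB page table.
rni_4k = round(z10_rni(4.64, 0.01, 0.63), 2)
rni_1mb = round(z10_rni(3.03, 0.01, 0.41), 2)
```

The two rounded values agree with the Rel Nest Intensity column of that table (0.09 and 0.06).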

# Formulas – z196

Workload Characterization L1 Sourcing from cache/memory hierarchy

| Metric   | Calculation – note all fields are deltas between intervals                                                          |
|----------|---------------------------------------------------------------------------------------------------------------------|
| CPI      | B0 / B1                                                                                                             |
| PRBSTATE | (P33 / B1) * 100                                                                                                    |
| L1MP     | ((B2+B4) / B1) * 100                                                                                                |
| L2P      | ((E128+E129) / (B2+B4)) * 100                                                                                       |
| L3P      | ((E150+E153) / (B2+B4)) * 100                                                                                       |
| L4LP     | ((E135+E136+E152+E155) / (B2+B4)) * 100                                                                             |
| L4RP     | ((E138+E139+E134+E143) / (B2+B4)) * 100                                                                             |
| MEMP     | (((E141+E142) + (B2+B4-E128-E129-E150-E153-E135-E136-E152-E155-E138-E139-E134-E143-E141-E142)) / (B2+B4)) * 100 |
| LPARCPU  | ( ((1/CPSP/1,000,000) * B0) / Interval in Seconds) * 100                                                            |

CPI – Cycles per Instruction

PRBSTATE - % Problem State

L1MP – Level 1 Miss %

L2P – % sourced from Level 2 cache

L3P – % sourced from Level 3 on same Chip cache

L4LP – % sourced from Level 4 Local cache (on same book)

L4RP – % sourced from Level 4 Remote cache (on different book)

MEMP - % sourced from Memory

LPARCPU - APPL% (GCPs, zAAPs, zIIPs) captured and uncaptured

- B\* Basic Counter Set Counter Number
- P\* Problem-State Counter Set Counter Number

See "The Set-Program-Parameter and CPU-Measurement Facilities" SA23-2260-0 for full description

E\* - Extended Counters - Counter Number

See the forthcoming "The CPU-Measurement Facility Extended Counters Definition for z10 and z196" SA23-2261-01 for full description
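Because the z196 sourcing formulas all share one shape (a group of extended counters divided by B2+B4), they lend themselves to a table-driven sketch. The names `Z196_GROUPS`, `Z196_MEMORY` and `z196_sourcing`, and the sample deltas, are assumptions for illustration. The counter numbers are copied from the table above.

```python
# Extended counter numbers per sourcing level, as listed in the z196 table.
Z196_GROUPS = {
    "L2P": (128, 129),
    "L3P": (150, 153),
    "L4LP": (135, 136, 152, 155),
    "L4RP": (138, 139, 134, 143),
}
Z196_MEMORY = (141, 142)


def z196_sourcing(c):
    """Return the z196 L1 sourcing percentages from counter deltas."""
    l1_misses = c["B2"] + c["B4"]

    def total(numbers):
        return sum(c[f"E{n}"] for n in numbers)

    pct = {name: total(nums) / l1_misses * 100 for name, nums in Z196_GROUPS.items()}
    counted = sum(total(nums) for nums in Z196_GROUPS.values()) + total(Z196_MEMORY)
    # As in the MEMP formula, L1 misses not covered by any counter go to memory.
    pct["MEMP"] = (total(Z196_MEMORY) + (l1_misses - counted)) / l1_misses * 100
    return pct


# Made-up deltas: 100 L1 misses in the interval.
sample = {
    "B2": 40, "B4": 60,
    "E128": 50, "E129": 20,
    "E150": 10, "E153": 8,
    "E135": 2, "E136": 1, "E152": 1, "E155": 1,
    "E138": 1, "E139": 1, "E134": 0, "E143": 0,
    "E141": 2, "E142": 1,
}
sourcing = z196_sourcing(sample)
```

Keeping the counter numbers in one dictionary makes it easier to adjust them if the formulas change, as the slides note they may.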

# Formulas – z196 Additional

| Metric              | Calculation – note all fields are deltas between intervals |
|---------------------|------------------------------------------------------------|
| Est Instr Cmplx CPI | CPI – Estimated Finite CPI                                 |
| Est Finite CPI      | ((B3+B5) / B1) * .63                                       |
| Est SCPL1M          | ((B3+B5) / (B2+B4)) * .63                                  |
| Rel Nest Intensity  | 1.6*(0.4*L3P + 1.0*L4LP + 2.4*L4RP + 7.5*MEMP) / 100       |
| Eff GHz             | CPSP / 1000                                                |

#### Note these Formulas may change in the future

Est Instr Cmplx CPI – Estimated Instruction Complexity CPI (infinite L1)

Est Finite CPI - Estimated CPI from Finite cache/memory

Est SCPL1M - Estimated Sourcing Cycles per Level 1 Miss

Rel Nest Intensity – Reflects distribution and latency of sourcing from shared caches and memory

Eff GHz – Effective gigahertz for GCPs, cycles per nanosecond

Workload Characterization L1 Sourcing from cache/memory hierarchy

- B\* Basic Counter Set Counter Number
- P\* Problem-State Counter Set Counter Number

See "The Set-Program-Parameter and CPU-Measurement Facilities" SA23-2260-0 for full description
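As a quick check, the additional z196 formulas can be applied to the first row of the WSC z196 sample (the 17.25 interval). The function names are assumptions. The second helper relies on an identity that follows from the formulas themselves: since L1MP = ((B2+B4) / B1) * 100, Est Finite CPI equals Est SCPL1M * L1MP / 100.

```python
def z196_rni(l3p, l4lp, l4rp, memp):
    """Relative Nest Intensity for z196, from sourcing percentages."""
    return 1.6 * (0.4 * l3p + 1.0 * l4lp + 2.4 * l4rp + 7.5 * memp) / 100


def est_finite_cpi_from(l1mp, scpl1m):
    """Est Finite CPI rebuilt from L1MP (%) and Est SCPL1M."""
    return l1mp / 100 * scpl1m


# Values from the 17.25 row of the z196 sample table.
rni = round(z196_rni(l3p=20.5, l4lp=0.9, l4rp=0.2, memp=0.7), 2)
finite = est_finite_cpi_from(l1mp=3.7, scpl1m=26)

# CPI is the sum of its two estimated components (2.70 + 0.95 in that row).
cpi = 2.70 + 0.95
```

`rni` reproduces the 0.24 shown in the table. `finite` comes out near 0.96 rather than the reported 0.95 because the table inputs are already rounded, so small differences of this kind are expected when working from reported values rather than raw counters.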

# WSC Experiences Lessons Learned since March 2010

## Customer HiperDispatch Measurement

## Customer 1 MB Page Measurement

# CPU MF – Lessons Learned since March 2010

CPU MF Performance Metrics continue to help explain <u>why</u> performance changed

- LPAR Configuration Changes, including
  - HD=Yes/No
- 1 MB vs. 4 KB Pages
- GHz measurement for State Changes
- Customers continue to successfully run CPU MF COUNTERS collecting SMF 113s
  - Turned on and left on over days or months, without any reported performance impact
  - Volunteer Feedback: easy to enable, minimal time investment
- SMF 113 Logical CPU IDs are equal to the SMF 70 Logical CPU IDs
  - Directly identifies GCPs, zIIPs or zAAPs in SMF 113s with **APAR OA30486** for z10s and z196s
- LPAR Management Time is NOT included in LPARCPU time (SMF 113 Cycles)
- Utilize the Counter Version Number fields to map to technology
  - SMF113\_2\_CTRVN2 (Crypto or Extended counter sets) = "2" for z196, "1" for z10
- z/VM CPU MF native prototype in process
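One way to use the Counter Version Number is to select the machine-specific constant before applying a formula. The sketch below assumes the SMF113\_2\_CTRVN2 value has already been extracted from the record as an integer. The names are hypothetical, and the scale factors .84 and .63 are the ones on the z10 and z196 formula slides.

```python
# SMF113_2_CTRVN2 values from the slide: 1 for z10, 2 for z196.
CTRVN2_TO_MACHINE = {1: "z10", 2: "z196"}

# Scale factors used in the Est Finite CPI formulas for each machine.
EST_FINITE_CPI_SCALE = {"z10": 0.84, "z196": 0.63}


def est_finite_cpi(ctrvn2, b1, b3, b5):
    """Est Finite CPI, picking the scale factor from the counter version."""
    machine = CTRVN2_TO_MACHINE.get(ctrvn2)
    if machine is None:
        # Unknown technology: refuse rather than guess a factor.
        raise ValueError(f"unrecognized counter version number: {ctrvn2}")
    return (b3 + b5) / b1 * EST_FINITE_CPI_SCALE[machine]
```

Raising on an unknown version is a deliberate choice here, since applying formulas from the wrong machine would produce plausible-looking but meaningless numbers.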



CPU MF can help provide <u>cache/memory resource</u> change insights





HiperDispatch attempts to align Logical CPs with PUs in the same Book







#### From CPU MF, HiperDispatch=YES May Decrease the L2 Remote %



CPI - Cycles per Instruction

PRBSTATE - % Problem State

L1MP – Level 1 Miss %

L15P - % sourced from L1.5 cache

L2LP - % sourced from Level 2 Local cache (on same book)

L2RP - % sourced from Level 2 Remote cache (on different book)

MEMP - % sourced from Memory

LPARCPU - APPL% (GCPs, zAAPs, zIIPs) captured and uncaptured

Potential Workload Characterization z10 L1 sourcing from cache/memory hierarchy



## HiperDispatch=Yes Customer Improvement on z10 721

|                      | Day | Hour | CPI  | Prb State | Est Instr Cmplx CPI | Est Finite CPI | Est SCPL1M | L1MP | L15P | L2LP | L2RP | MEMP | Rel Nest Intensity | LPARCPU | HD ? |
|----------------------|-----|------|------|-----------|---------------------|----------------|------------|------|------|------|------|------|--------------------|---------|------|
|                      | 12  | 11.0 | 3.2  | 53.7      |                     | 4.45           | 115        | 3.9  | 63.7 | 23.1 | 6.7  |      |                    |         |      |
|                      | 11  | 11.0 | 7.5  | 52.8      |                     | 3.71           | 97         | 3.8  | 70.3 | 19.8 | 3.7  |      |                    |         |      |
| HD=Yes % Improvement |     |      | 1.10 | 1.02      | 1.01                | 1.20           | 1.19       | 1.01 | 0.91 | 1.17 | 1.80 | 1.05 | 1.17               | 1.07    |      |

CPI – Cycles per Instruction

Prb State - % Problem State

Est Instr Cmplx CPI – Estimated Instruction Complexity CPI (infinite L1)

Est Finite CPI – Estimated CPI from Finite cache/memory

Est SCPL1M – Estimated Sourcing Cycles per Level 1 Miss

L1MP – Level 1 Miss %

L15P – % sourced from L1.5 cache

L2LP – % sourced from Level 2 Local cache (on same book)

L2RP – % sourced from Level 2 Remote cache (on different book)

MEMP - % sourced from Memory

Rel Nest Intensity – Reflects distribution and latency of sourcing from shared caches and memory

LPARCPU - APPL% (GCPs, zAAPs, zIIPs) captured and uncaptured

# HiperDispatch=YES resulted in a ~10% improvement as measured by CPU MF.

Additional measurements over multiple days, using traditional CPU/Transaction metrics, should be used to validate HD=No vs. HD=Yes results

- Partition has 21 logical processors
  - 2 additional partitions on the CEC

#### \*\*\* New - This is an evolving use of CPU MF \*\*\*

### CPU MF can help measure the impact of 1 MB Pages in your environment

| Test                    | CPI  | PRBSTATE | Est Instr Cmplx CPI | Est Finite CPI | Est SCPL1M | L1MP | L15P  | L2LP | L2RP | MEMP | Rel Nest Intensity | LPARCPU | GHz | TLB1 Miss CPU% of Total CPU | TLB1 Cycles per Miss | PTE% of all TLB1 Misses |
|-------------------------|------|----------|---------------------|----------------|------------|------|-------|------|------|------|--------------------|---------|-----|-----------------------------|----------------------|-------------------------|
| DB2 V10 4K PageFix=YES  | 4.46 | 1.29     | 2.63                | 1.83           | 26         | 7.13 | 94.72 | 4.64 | 0.01 | 0.63 | 0.09               | 28.2    | 4.4 | 16.0                        | 83                   | 19.2                    |
| DB2 V10 1MB PageFix=YES | 4.26 | 1.13     | 2.58                | 1.68           | 23         | 7.25 | 96.56 | 3.03 | 0.01 | 0.41 | 0.06               | 33.9    | 4.4 | 15.6                        | 65                   | 13.7                    |
| Ratio (4K / 1MB)        | 1.05 |          |                     |                |            | 0.98 | 0.98  | 1.53 |      |      |                    |         |     | 1.03                        | 1.28                 | 1.40                    |

- DB2 10 for z/OS Beta provides ability to specify 1 MB Pages for DB2 Buffer Pools
- 1 MB Pages can help reduce TLB Page Table Entry Misses
- CPU MF can be used to help measure the 1 MB Page impact for your environment
  - A DB2 10 for z/OS Beta customer ran a DB2 batch job that exercised 4 KB and 1 MB pages (PageFix=Yes), with LFArea=40M
    - The batch job executed 30M Selects, 20M Inserts, and 10M Fetches
  - CPU MF showed the following, though this is not necessarily representative of 1 MB Page results
    - 40% reduction in Page Table Entry % (PTE) of all TLB1 Misses
    - 28% reduction in TLB1 Cycles per Miss, 3% reduction in TLB1 Miss CPU% of Total CPU
    - Lower CPI and Nest Intensity
    - DB2 Accounting report showed a 1.4% reduction in CPU time

Warning: These numbers come from a synthetic Benchmark and do not represent a production workload

- As you implement 1 MB Page exploiters, use CPU MF to help measure the impact
  - Measure it in its intended Production LPAR
- See white paper "IBM System z10 Support for large pages"
  - http://www.research.ibm.com/journal/abstracts/rd/531/tzortzatos.html
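The comparison row of the 1 MB page table is the 4K value divided by the 1 MB value for each metric. A short sketch, using values copied from the table, shows how the same comparison could be repeated for your own measurements (the dictionary keys and variable names are assumptions for illustration):

```python
# Values from the "DB2 V10 4K PageFix=YES" and "DB2 V10 1MB PageFix=YES" rows.
four_k = {
    "CPI": 4.46, "L1MP": 7.13, "L15P": 94.72, "L2LP": 4.64,
    "TLB1 Miss CPU%": 16.0, "TLB1 Cycles per Miss": 83, "PTE% of TLB1 Misses": 19.2,
}
one_mb = {
    "CPI": 4.26, "L1MP": 7.25, "L15P": 96.56, "L2LP": 3.03,
    "TLB1 Miss CPU%": 15.6, "TLB1 Cycles per Miss": 65, "PTE% of TLB1 Misses": 13.7,
}

# Ratio above 1.0 means the 4K run had the higher value for that metric.
ratios = {name: round(four_k[name] / one_mb[name], 2) for name in four_k}

for name, ratio in ratios.items():
    print(f"{name:22s} {ratio:.2f}")
```

The ratios match the table: 1.05 for CPI, 1.28 for TLB1 Cycles per Miss and 1.40 for PTE% of all TLB1 Misses, which are the figures the bullets above quote as 28% and 40%.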



### DB2 10 for z/OS Beta Customer – RMF for 1 MB Page

RMF Paging Activity report (page 2), 12.45.00 to 13.00.01, 15 samples, High UIC (AVG/MAX/MIN) = 65535

| Frame and slot counts                | MIN       | MAX       | AVG       |
|--------------------------------------|-----------|-----------|-----------|
| Central storage: REGIONS+SWA         | 539,686   | 542,913   | 541,822   |
| Central storage: TOTAL FRAMES        | 786,432   | 786,432   | 786,432   |
| Fixed frames: NUCLEUS                | 2,608     | 2,608     | 2,608     |
| Local page data set: AVAILABLE SLOTS | 2,854,758 | 2,854,758 | 2,854,758 |
| Shared frames: CENTRAL STORAGE       | 6,428     | 6,557     | 6,489     |
| Memory objects: COMMON               | 3         | 3         | 3         |
| Memory objects: SHARED               | 6         | 6         | 6         |
| Memory objects: LARGE                | 40        | 40        | 40        |
| Frames: 1 MB                         | 40        | 40        |           |

# Formulas – Additional TLB

| Metric – z10                                  | <b>Calculation</b> – note all fields are <b>deltas</b> between intervals |
|-----------------------------------------------|--------------------------------------------------------------------------|
| TLB1 CPU Miss % of Total CPU                  | ( (E145+E146) / B0) * 100                                                |
| TLB1 Cycles per TLB Miss                      | (E145+E146) / (E138+E139)                                                |
| PTE % of all TLB1 Misses                      | (E140 / (E138+E139) ) * 100                                              |
|                                               |                                                                          |
| Metric – z196                                 | <b>Calculation</b> – note all fields are <b>deltas</b> between intervals |
| TLB1 CPU Miss % of Total CPU                  |                                                                          |

#### Note these Formulas may change in the future

TLB1 CPU Miss % of Total CPU - TLB CPU % of Total CPU

TLB1 Cycles per TLB Miss – Cycles per TLB Miss

PTE % of all TLB1 Misses – Page Table Entry % misses

B\* - Basic Counter Set - Counter Number

See "The Set-Program-Parameter and CPU-Measurement Facilities" SA23-2260-0 for full description

E\* - Extended Counters - Counter Number

See "IBM The CPU-Measurement Facility Extended Counters Definition for z10" SA23-2261-0 or "The CPU-Measurement Facility Extended Counters Definition for z10 and z196" SA23-2261-01 for full description
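The three z10 TLB formulas can be sketched as follows. The function name, the dictionary layout and the sample deltas are assumptions for illustration. Note that these counter numbers apply to z10 only, since the z196 formulas use E138 and E139 for a different purpose.

```python
def z10_tlb_metrics(c):
    """z10 TLB1 metrics from counter deltas (hypothetical layout)."""
    tlb1_miss_cycles = c["E145"] + c["E146"]
    tlb1_misses = c["E138"] + c["E139"]
    return {
        "TLB1 CPU Miss % of Total CPU": tlb1_miss_cycles / c["B0"] * 100,
        "TLB1 Cycles per TLB Miss": tlb1_miss_cycles / tlb1_misses,
        "PTE % of all TLB1 Misses": c["E140"] / tlb1_misses * 100,
    }


# Made-up deltas chosen to give round results.
tlb = z10_tlb_metrics(
    {"B0": 1000, "E138": 3, "E139": 2, "E140": 1, "E145": 100, "E146": 60}
)
```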

# z10 and z196 CPU MF COUNTERS Summary

- Traditional metrics continue to provide the best view of Performance
  - CPU MF can help explain <u>why</u> a change occurred
- First Step completed in Workload Characterization for Capacity Sizing
  - Relative Nest Intensity calculation today gives a hint to zPCR
- Volunteers are still needed for our Workload Characterization study for refinement
  - Volunteer feedback: this is very easy to enable, with a minimal time investment
- CPU MF has a very low overhead to run and is easy to implement
  - Less than 1/100 of a second for the HIS address space in a 15-minute interval
  - Customers are successfully running CPU MF in Production Today
- Recommend enabling CPU MF COUNTERS on z10s and z196s today!
  - To supplement current performance metrics (e.g. from SMF, RMF, DB2, CICS), turn on and leave on
  - APAR OA30486 required for z196s and recommended for z10s
- CPU MF Overview and WSC Experiences Techdoc TC000041
  - http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TC000041
  - CPU MF presentation and a <u>detailed write up for enabling CPU MF</u>









# Acknowledgements

### Many people contributed to this presentation including:

Riaz Ahmad

Greg Boyd

Jane Bartik

Harv Emery

Gary King

Frank Kyne

Steve Olenik

Bob Rogers

**Bill Schray** 

**Brian Smith** 

Bob St John

Elpida Tzortzatos

Kathy Walsh

# Disclaimer





Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.





# Thank You for attending!

# Appendix

# CPU MF – Lessons Learned since August 2009

- CPU MF Performance Metrics can be used to help understand <u>why</u> performance changed
- Customers are successfully running CPU MF COUNTERS collecting SMF 113s
  - Over days and months without any reported performance impact
  - Feedback from Volunteers is this is very easy to enable, with a minimal time investment
- SMF 113 Logical CPU IDs are equal to the SMF 70 Logical CPU IDs
  - Can match up SMF 113s & SMF 70s to identify GCPs, zIIPs or zAAPs
  - Can see the unique Vertical Polarity Logical CPs cache/memory characteristics
    - E.G. Vertical Mediums may have higher L2 Remote activity
- In multi-book z10 ECs there can be L2 Remote Activity even if <=12 GCPs
  - Because of I/O activity from SAPs as the data is initially stored in the Remote L2

#### Utilize the Counter Version Number fields to map to technology

- Number is increased for a change in meaning or number of counters
  - SMF113\_2\_CTRVN1 Basic or Problem-State counter sets
  - SMF113\_2\_CTRVN2 Crypto or Extended counter sets

## CPU MF Update – Lessons Learned since March 2009

- L1 Miss % can be determined from CPU MF COUNTERS
- z10 EC must be at bundle #20 or higher for CPU MF COUNTERS
- IRD considerations
  - If CPU goes offline, only activity within the interval is recorded in an Intermediate record, then
    - If no activity in follow on 15 minute interval(s), Intermediate record is not cut for the CPUID
      - No Final record when HIS is ended
    - When activity resumes, Intermediate record is written for CPUID
- New APAR OA27623 to add "CPU Speed" to SMF 113 and to HIS COUNTERS output
  - Processor speed for which the hardware event counters are recorded. Speed is in cycles / microsecond - "4404" for z10 EC
  - SMF 113 new field: SMF113\_2\_CPSP 4 byte binary
  - Simplifies conversion of Cycles into "Time"
- Customers are successfully running CPU MF COUNTERS (and collecting SMF 113s) over 24 hours
- Analyze the "major" LPARs on a z10 at the same time
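As a minimal sketch of how the "CPU Speed" field simplifies converting cycles into time: the speed is given in cycles per microsecond, so dividing a cycle count by it yields microseconds. The function name below is hypothetical; only the unit and the "4404" value for z10 EC come from the slide.

```python
def cycles_to_seconds(cycles, cpu_speed):
    """Convert a cycle count to seconds.

    cpu_speed is in cycles per microsecond, the unit the slide gives for
    SMF113_2_CPSP (for example 4404 on a z10 EC).
    """
    microseconds = cycles / cpu_speed
    return microseconds / 1_000_000


# 4,404,000,000 cycles at 4404 cycles/microsecond is one second of CPU time.
print(cycles_to_seconds(4_404_000_000, 4404))  # prints 1.0
```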



# Documentation

- *MVS Commands* SA22-7627-19
  - Setting up hardware event data collection 1-39
- The Set-Program-Parameter and CPU-Measurement Facilities SA23-2260-0
  - Full description of Basic, Problem-State and Crypto Counter Sets
- IBM The CPU-Measurement Facility Extended Counters Definition for z10 SA23-2261-0
- IBM The CPU-Measurement Facility Extended Counters Definition for z10 and z196 SA23-2261-01
- WSC Short Stories and Tall Tales
  - SHARE Summer 2009 Denver Session 2136 John Burg
- CPU MF Overview and WSC Experiences Techdoc TC000041 available March 26 2010
  - http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TC000041
  - SHARE Winter 2010 presentation and detailed write up for enabling CPU MF John Burg
- ITSO Red Book reference Planned for 4QT 2010
  - *Exploiting System z LPAR Capacity Controls* SG24-7846. 2 Part Book:
    - Part 1 CPU MF
    - Part 2 HiperDispatch, Group Capacity Controls, hard/soft capping
    - Draft available ~April 1

http://www.redbooks.ibm.com/redbooks.nsf/home?ReadForm&page=drafts

### APAR OA27623 "CPU Speed" – HIS COUNTERS Output

HIS019I EVENT COUNTERS INFORMATION VERSION 1

FILE NAME: SYSHIS20090615.112833.CNT

COMMAND: MODIFY HIS, B, TT='UA46797', CTRONLY, CTR=ALL

COUNTER VERSION NUMBER 1: 1 COUNTER VERSION NUMBER 2: 1

COUNTER SET= BASIC

COUNTER IDENTIFIERS:

- 0: CYCLE COUNT
- 1: INSTRUCTION COUNT
- 2: L1 I-CACHE DIRECTORY-WRITE COUNT
- 3: L1 I-CACHE PENALTY CYCLE COUNT
- 4: L1 D-CACHE DIRECTORY-WRITE COUNT
- 5: L1 D-CACHE PENALTY CYCLE COUNT

START TIME: 2009/06/15 11:28:33 START TOD: C4574FEC19DF7217

END TIME: 2009/06/15 12:19:09 END TOD: C4575B3B6919C911

COUNTER VALUES (HEXADECIMAL) FOR CPU 00 (CPU SPEED = 4404 CYCLES/MIC)

0- 3 00000017978F0641 00000004435EC932 00000000C3DB63E 000000014038D222

4- 7 0000000223375DD 00000004F5D256E8

START TIME: 2009/06/15 11:28:33 START TOD: C4574FEC19E10D9

END TIME: 2009/06/15 12:19:09 END TOD: C4575B3B691AE091

COUNTER VALUES (HEXADECIMAL) FOR CPU 05 (CPU SPEED = 4404 CYCLES/MIC)

0- 3 00000016D275AAA9 00000004395C24A6 00000000C2E714E 000000019E57EBE0

4- 7 0000000219A39DC 0000004E4C3881F

START TIME: 2009/06/15 11:28:33 START TOD: C4574FEC19E29817

END TIME: 2009/06/15 12:19:09 END TOD: C4575B3B691B8C11

COUNTER VALUES (HEXADECIMAL) FOR CPU 0A (CPU SPEED = 4404 CYCLES/MIC)

0- 3 000000002803BE2 000000000889237 0000000000093D 00000000005B310

4- 7 000000000021461 000000001D9D453

START TIME: 2009/06/15 11:28:33 START TOD: C4574FEC19E43D97

END TIME: 2009/06/15 12:19:09 END TOD: C4575B3B691C7411

COUNTER VALUES (HEXADECIMAL) FOR CPU 0B (CPU SPEED = 4404 CYCLES/MIC)

0- 3 000000002513682 0000000001692C2 0000000000F3FE 0000000095A685

4- 7 00000000002092A 000000001D32119

START TIME: 2009/06/15 11:28:33 START TOD: C4574FEC19E58997

END TIME: 2009/06/15 12:19:09 END TOD: C4575B3B691D5311

COUNTER VALUES (HEXADECIMAL) FOR CPU 0C (CPU SPEED = 4404 CYCLES/MIC)

4- 7 000000000020BEF 000000001AFF518

START TIME: 2009/06/15 11:28:33 START TOD: C4574FEC19E73D9

END TIME: 2009/06/15 12:19:09 END TOD: C4575B3B691E2E91

COUNTER VALUES (HEXADECIMAL) FOR CPU 0D (CPU SPEED = 4404 CYCLES/MIC)

0- 3 0000000021ADEE1 000000000169152 00000000000858 0000000644954

> These numbers come from a synthetic Benchmark and do not represent a production workload
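The START TOD and END TOD values in the HIS output can be turned into an elapsed time, which is what is needed to normalize counters to per-second rates. The sketch below is illustrative only and the function names are hypothetical. It relies on the z/Architecture TOD clock format, in which bit 51 increments once per microsecond, so shifting the 64-bit value right by 12 bits gives microseconds.

```python
def tod_to_microseconds(tod_hex):
    """Convert a 16-digit hexadecimal TOD clock value to whole microseconds."""
    # Bits 52-63 are sub-microsecond resolution; drop them.
    return int(tod_hex, 16) >> 12


def elapsed_seconds(start_tod_hex, end_tod_hex):
    """Elapsed time in seconds between two TOD clock values."""
    delta_us = tod_to_microseconds(end_tod_hex) - tod_to_microseconds(start_tod_hex)
    return delta_us / 1_000_000


# START TOD and END TOD for CPU 00 as shown in the output above.
seconds = elapsed_seconds("C4574FEC19DF7217", "C4575B3B6919C911")

# Counter 0 (cycle count) for CPU 00, normalized to cycles per second.
cycles = int("00000017978F0641", 16)
print(f"elapsed: {seconds:.2f} s, cycles/sec: {cycles / seconds:.2f}")
```

The elapsed time agrees with the printed START TIME and END TIME, which are 50 minutes 36 seconds apart.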



# How it works

### Hardware Instrumentation Counters



# What data is in the CPU MF – per Logical CP

- Basic Counters (and Problem) per CPU (1)
  - Cycles
  - Instructions
  - L1 Cache Sourcing basic information
- Crypto Counters per CPU (1)
  - Counts and Cycles by Crypto function
- Extended Counters per CPU (Model Dependent) (2)
  - Cache Hierarchy Information and more
    - z10 L1 Sourcing detailed information
- 1 See "The Set-Program-Parameter and CPU-Measurement Facilities" SA23-2260-0 for full description
- 2 See "IBM The CPU-Measurement Facility Extended Counters Definition for z10" SA23-2261-0 for full description

See Appendix for Basic, Problem and Crypto Counters

z10 L1 Cache Hierarchy Sourcing



## What we did

- Set up CPU MF on WSC z10 and z/OS 1.10
- Started/Modified HIS and collected SMF 113s and \*.CNT Data
  - Ran "COUNTERS" mode, COUNTERS=ALL (Basic, Problem, Crypto, Extended) via:
    - "F HIS,B,TT='EncrypCounters2',PATH='/his/',CTRONLY,CTR=ALL"

#### Ran DASD dumps

- DASD dumps sequentially over 20 minute duration
- With option: ENCRYPT(CLRTDES)

#### Built sample reports with a REXX exec

- Used \*.CNT output as input
- Validated with SMF 113s
- Reports
  - Basic Counters
  - Basic / Extended Counters z10 L1 Cache Hierarchy Sourcing Report
  - Crypto Counters



- HIS019I EVENT COUNTERS INFORMATION
- FILE NAME: SYSHIS20090207.161102.CNT
- COMMAND: MODIFY HIS, B, TT='EncrypCounters2', PATH='/his/', CTRONLY, CTR=ALL
- COUNTER VERSION NUMBER 1: 1 COUNTER VERSION NUMBER 2: 1
- **COUNTER SET= BASIC**
- Description, COUNTER IDENTIFIERS:
  - 0: CYCLE COUNT
  - 1: INSTRUCTION COUNT
  - 2: L1 I-CACHE DIRECTORY-WRITE COUNT
  - 3: L1 I-CACHE PENALTY CYCLE COUNT
  - 4: L1 D-CACHE DIRECTORY-WRITE COUNT
  - 5: L1 D-CACHE PENALTY CYCLE COUNT
- Start / End time and Counters per CPU - 00
  - START TIME: 2009/02/07 16:11:02 START TOD: C3B6ADBE7AD83D26
  - END TIME: 2009/02/07 16:31:19 END TOD: C3B6B24700FC45A5
  - COUNTER VALUES (HEXADECIMAL) FOR CPU 00:
  - 0- 3 0000004689BEBF20 0000000433831366 0000000014CF0790 000000021B57E0D8
  - 4- 7 00000002A620C97 000000B25C43DBC
- Counters per CPU - 01
  - START TIME: 2009/02/07 16:11:02 START TOD: C3B6ADBE7AD95826
  - END TIME: 2009/02/07 16:31:19 END TOD: C3B6B24700FD3625
  - COUNTER VALUES (HEXADECIMAL) FOR CPU 01:
  - 0- 3 00000048CFB22F1D 000000048D23D49A 0000000154D89E5 0000000229B662EA
  - 4- 7 00000002C1F067B 000000B8087F6A7
- Counters per CPU - 04
  - START TIME: 2009/02/07 16:11:02 START TOD: C3B6ADBE7ADABCA6
  - END TIME: 2009/02/07 16:31:19 END TOD: C3B6B24700FE1525
  - COUNTER VALUES (HEXADECIMAL) FOR CPU 04:
  - 0- 3 00000021DE76A328 0000000A8F16E5E9 00000000022392 0000000008AC8F2
  - 4- 7 00000001B92F07B 00000035E926CFD
- COUNTER SET= PROBLEM-STATE
- COUNTER IDENTIFIERS:
  - 32: PROBLEM-STATE CYCLE COUNT
  - 33: PROBLEM-STATE INSTRUCTION COUNT
  - 34: PROBLEM-STATE L1 I-CACHE DIRECTORY-WRITE COUNT
  - 35: PROBLEM-STATE L1 I-CACHE PENALTY CYCLE COUNT
  - 36: PROBLEM-STATE L1 D-CACHE DIRECTORY-WRITE COUNT
  - 37: PROBLEM-STATE L1 D-CACHE PENALTY CYCLE COUNT





# Sample Report – Basic Counters

\*\*\* Z10 Summary - BASIC Counters Information \*\*\* \*\*\* TOTAL for all CPUs \*\*\*

| Counter                          | Rate             |
|----------------------------------|------------------|
| Cycle Count                      | 625429033.94/sec |
| Instruction Count                | 68153013.72/sec  |
| L1 I-Cache Directory-Write Count | 580653.65/sec    |
| L1 I-Cache Penalty Cycle Count   | 15076029.05/sec  |
| L1 D-Cache Directory-Write Count | 1572649.35/sec   |
| L1 D-Cache Penalty Cycle Count   | 91824855.27/sec  |

Total z10 Busy : 4.79% - for the 3 CPUs

Basic Counters are normalized to per second.

L1 Index and Directory Write Counts are used in Cache Hierarchy Sourcing.

#### L1 Miss % can be derived from CPU MF information

- Instruction Count is the base. If instructions are not in the z10 L1 Cache, then they must be "Sourced" from the z10 hierarchy. The Total "Sourced" is the Total Directory Write Count, that is, the "Misses"
- L1 Miss % = Directory Write Counts (I+D) / Instruction Count
- 3.2% = (580,653.65 + 1,572,649.35) / 68,153,013.72
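The L1 Miss % arithmetic can be checked with a short Python sketch. The function name is hypothetical; the inputs are the per-second rates from the Basic Counters sample report.

```python
def l1_miss_pct(i_dir_writes, d_dir_writes, instructions):
    """L1 Miss % = (I + D directory-write counts) / instruction count * 100."""
    return (i_dir_writes + d_dir_writes) / instructions * 100


# Per-second rates from the sample report (synthetic benchmark).
miss = l1_miss_pct(580_653.65, 1_572_649.35, 68_153_013.72)
print(f"L1 Miss % = {miss:.1f}%")  # prints "L1 Miss % = 3.2%"
```

Because the formula is a ratio, it gives the same result whether the inputs are raw interval deltas or per-second rates, as long as all three cover the same interval.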

These numbers come from a synthetic Benchmark and do not represent a production workload




### Sample Report – Basic / Extended Counters z10 L1 Cache Hierarchy Sourcing

\*\*\* z10 Summary - L1 Cache - TOTAL for all CPUs \*\*\*

| Source for L1                                                 | %      | Rate           |
|---------------------------------------------------------------|--------|----------------|
| Dir Write L1 Inst Cache from L1.5                             | 26.40% |                |
| Dir Write L1 Data Cache from L1.5                             | 52.40% |                |
| Dir Write L1 Inst Cache from L2 on same Book                  | 0.54%  |                |
| Dir Write L1 Data Cache from L2 on same Book                  | 20.16% |                |
| Dir Write L1 Inst Cache from L2 NOT on same Book              | 0.00%  |                |
| Dir Write L1 Data Cache from L2 NOT on same Book              | 0.22%  | 4703.85/Sec    |
| Dir Write L1 Inst Cache from Memory on same Book              | 0.00%  | 4.05/Sec       |
| Dir Write L1 Data Cache from Memory on same Book              | 0.00%  | 63.44/Sec      |
| Dir Write L1 Inst Cache from Memory NOT on same Book          |        | 455.48/Sec     |
| Dir Write L1 Data Cache from Memory NOT on same Book          |        | 5392.17/Sec    |
| Total                                                         |        | 2153303.00/Sec |

Various Sources from Extended Counters

Total L1 Sourcing from Basic Counters

> These numbers come from a synthetic Benchmark and do not represent a production workload

### CPU MF and HIS provide a z/OS logical view of z10 Resource Usage and Cache Hierarchy Sourcing





### Sample Report – Crypto Counters

[Report listing: Crypto Counters, TOTAL for all CPUs, with a per-second rate for each PRNG, SHA, DES and AES counter, followed by a CRYPTO BUSY SUMMARY for the 3 CPUs. Busy percentages shown: 0.00%, 0.00%, 2.55%, 0.00%, 2.55%.]

This information may be useful in determining:

- When and what encryption function is occurring (Count)?
- How many cycles are being used?

The encryption facility executed both SHA functions and TDES functions for this specific test.

Since CPU MF is new, this information is not available from RMF today

Need to analyze more Customer data

These numbers come from a synthetic Benchmark and do not represent a production workload




# Image Profile Security Customization for HIS

HMC (TSYSHMC) task Customize/Delete Activation Profiles, panel Customize Image Profiles: TSYS:TOSP2 : TOSP2 : Security

- Partition Security Options
  - ☑ Global performance data control
  - ☑ Input/output (I/O) configuration control
  - ☑ Cross partition authority
  - ☑ Logical partition isolation
- Counter Facility Security Options
  - Basic counter set authorization control
  - Problem state counter set authorization control
  - Crypto activity counter set authorization control
  - Extended counter set authorization control
  - Coprocessor group counter sets authorization control
- Sampling Facility Security Options
  - Basic sampling authorization control

# **Counter Data**

#### Basic Counter Set

- Cycle count
- Instruction count
- Level-1 I-cache directory write count
- Level-1 I-cache penalty cycle count
- Level-1 D-cache directory write count
- Level-1 D-cache penalty cycle count

#### Problem State Counter Set

- Problem state cycle count
- Problem state instruction count
- Problem state level-1 I-cache directory write count
- Problem state level-1 I-cache penalty cycle count
- Problem state level-1 D-cache directory write count
- Problem state level-1 D-cache penalty cycle count

#### Extended Counter Set

- Number and meaning of counters are model dependent



### **Counter Data**

#### Crypto Activity Counter Set (CPACF activity)

- PRNG function count
- PRNG cycle count
- PRNG blocked function count
- PRNG blocked cycle count
- SHA function count
- SHA cycle count
- SHA blocked function count
- SHA blocked cycle count
- DES function count
- DES cycle count
- DES blocked function count
- DES blocked cycle count
- AES function count
- AES cycle count
- AES blocked function count
- AES blocked cycle count

## SMF Record type 113, subtype 2

Layout: (SMF manual, HISYSMFR macro)

- Standard SMF record header ('1C'x bytes)
- SMF record control information
  - TOD when SMF record is written, etc.
  - Offset, length, and number of data sections
- Data section
  - TOD when counter data was captured
  - CPU number
  - Offset, length, and number of Counter Set Sections
  - Offset, length, and number of Counter Sections
  - Counter Set Sections
    - Counter Set type (1=BASIC, 2=PROB, 3=CRYPTO, 4=EXT)
    - Bit mask identifying the counters being recorded in array
      - e.g. 'FC0000000000000'x => counters 0-5 are valid
  - Counter Sections 8-byte counter values (contiguous)
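To illustrate the bit mask described above, here is a minimal Python sketch that lists which counters a mask marks as valid. It assumes, as the slide's example implies, that the leftmost bit corresponds to the first counter of the set. The function name is hypothetical and this is not the HISYSMFR mapping itself.

```python
def valid_counters(mask_hex):
    """Return the positions of the 1 bits in a hexadecimal mask, leftmost bit = 0."""
    width = len(mask_hex) * 4                      # 4 bits per hex digit
    bits = bin(int(mask_hex, 16))[2:].zfill(width)  # restore leading zero bits
    return [position for position, bit in enumerate(bits) if bit == "1"]


# 'FC'x is binary 11111100, so the first six counters are flagged.
print(valid_counters("FC00000000000000"))  # prints [0, 1, 2, 3, 4, 5]
```

The contiguous 8-byte values in the Counter Section would then be read in the same order as the positions returned here.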



### z10 versus z9 hardware comparison

#### z9 EC

- CPU
  - 1.7 GHz
  - superscalar
- Caches
  - L1 private 256 KB I-cache, 256 KB D-cache
  - L2 shared 40 MB / book
  - book interconnect: ring

#### z10 EC

- CPU
  - 4.4 GHz
  - redesigned pipeline
  - superscalar
- Caches
  - L1 private 64 KB I-cache, 128 KB D-cache
  - L1.5 private 3 MB
  - L2 shared 48 MB / book
  - book interconnect: star







# Usage & Invocation - Additions to the .CNT file

The .CNT file adds a new line to describe the state (new version identifier)

When a state change was detected and STATECHANGE=STOP

HIS019I EVENT COUNTERS INFORMATION VERSION 2

FILE NAME: SYSHISyyyymmdd.hhmmss.000.CNT

COMMAND: MODIFY HIS, xxxx

**STATE CHANGE: YES, STOP**

COUNTER VERSION NUMBER 1: XXXX COUNTER VERSION NUMBER 2: XXXX

When a state change was detected and STATECHANGE=IGNORE

HIS019I EVENT COUNTERS INFORMATION VERSION 2

FILE NAME: SYSHISyyyymmdd.hhmmss.000.CNT

COMMAND: MODIFY HIS, xxxx

**STATE CHANGE: YES, IGNORE**

COUNTER VERSION NUMBER 1: XXXX COUNTER VERSION NUMBER 2: XXXX

When a state change was detected and STATECHANGE=SAVE

HIS019I EVENT COUNTERS INFORMATION VERSION 2

FILE NAME: SYSHISyyyymmdd.hhmmss.000.CNT

COMMAND: MODIFY HIS, xxxx

**STATE CHANGE: YES, SAVE**

COUNTER VERSION NUMBER 1: xxxx COUNTER VERSION NUMBER 2: xxxx

When no state change was detected

HIS019I EVENT COUNTERS INFORMATION VERSION 2

FILE NAME: SYSHISyyyymmdd.hhmmss.000.CNT

COMMAND: MODIFY HIS, xxxx

**STATE CHANGE: NO**

COUNTER VERSION NUMBER 1: XXXX COUNTER VERSION NUMBER 2: XXXX